Efficient Top-k Query Algorithms Using K-Skyband Partition
Abstract
Efficient processing of top-k queries has become a classical research area. Fagin et al. proposed the “middleware cost” as a measure for top-k query algorithms. In some scenarios there is no way to perform a random access, and Fagin et al. proposed the NRA (No Random Access) algorithm for that case. In this paper, we investigate the intrinsic relation between top-k queries and K-skyband queries. Based on that relation, we propose a novel algorithm, DNRA (Dominate-NRA). The main idea of DNRA is to partition the original dataset into two sub-datasets, depending on whether each object belongs to the K-skyband or not. We prove that DNRA performs no more sorted accesses than NRA on any dataset. Furthermore, we partition the dataset into N sub-datasets (where N is the number of objects in the dataset) and propose a second algorithm, ADNRA (Advanced-DNRA). The partition of the dataset is pre-computed, and we discuss two techniques for computing it. Extensive experiments show that our algorithms perform several orders of magnitude fewer accesses than NRA, and that ADNRA performs significantly fewer accesses than DNRA on some datasets.
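The partition idea in the abstract can be illustrated with a minimal sketch. This is not DNRA itself, whose details the abstract does not give; it only shows a naive quadratic K-skyband partition and checks the well-known property that, for a scoring function with strictly positive weights, every top-k answer lies in the K-skyband. The function names (`dominates`, `partition_by_skyband`, `top_k`), the sample points, and the convention that larger attribute values are better are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Assumed convention: a dominates b if a is >= b in every
    attribute and strictly greater in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def partition_by_skyband(points: Sequence[Point], k: int) -> Tuple[List[Point], List[Point]]:
    """Split points into (K-skyband, rest) with a naive O(n^2) scan.
    A point is in the K-skyband if fewer than k other points dominate it."""
    inside, outside = [], []
    for p in points:
        dominators = sum(1 for q in points if dominates(q, p))
        (inside if dominators < k else outside).append(p)
    return inside, outside


def top_k(points: Sequence[Point], k: int, weights: Sequence[float]) -> List[Point]:
    """Top-k under a weighted sum, one example of a monotone scoring function."""
    def score(p: Point) -> float:
        return sum(w * x for w, x in zip(weights, p))
    return sorted(points, key=score, reverse=True)[:k]


# Hypothetical two-attribute dataset.
points = [(9, 1), (7, 7), (1, 9), (6, 6), (5, 5), (2, 2), (8, 3)]
inside, outside = partition_by_skyband(points, k=2)

# Every top-2 answer should come from the skyband partition,
# whatever strictly positive weights the query uses.
for weights in [(0.5, 0.5), (0.9, 0.1), (0.1, 0.9)]:
    assert set(top_k(points, 2, weights)) <= set(inside)
```

Because the partition depends only on dominance and not on the query weights, it can be computed once in advance, which is consistent with the abstract's remark that the partition is pre-computed.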
Similar resources
A Generic Framework for Top-k Pairs and Top-k Objects Queries over Sliding Windows
Top-k pairs and top-k objects queries have received significant attention from the research community. In this paper, we present the first approach to answer a broad class of top-k pairs and top-k objects queries over sliding windows. Our framework handles multiple top-k queries and each query is allowed to use a different scoring function, a different value of k and a different size of the slidi...
Probabilistic k-Skyband Operator over Sliding Windows
Given a set of data elements D in a d-dimensional space, a k-skyband query reports the set of elements which are dominated by at most k − 1 other elements in D. The k-skyband query is a fundamental query type in data analysis, as it keeps a minimum candidate set for all top-k ranking queries where the ranking functions are monotonic. In this paper, we study the problem of k-skyband over uncertain d...
Effective Space Usage Estimation for Sliding-Window Skybands
A skyline query computes all the “best” elements, those not dominated by any other element, and is thus very important for decision-making applications. Recently, it has been generalized to the skyband query: a k-skyband query returns those elements dominated by no more than k other elements. To incorporate the skyband operator into the stream engine for monitoring skybands over sliding windows, ...
Reverse k-skyband query based on reuse technology
In this paper, we introduce a new approach to answering the reverse k-skyband (RkSB) query, which returns all the points in a given dataset P whose dynamic k-skyband contains a specific query object q. The main ideas include reuse technology and early stopping. The former saves the information of node accesses during R-tree search into an auxiliary heap, which dramatically decreases the I/O cost and improv...
Parallel Algorithms for Top-k Query Processing
The general problem of answering top-k queries can be modeled using lists of objects sorted by their local scores. Fagin et al. proposed the “middleware cost” for a top-k query algorithm, and proposed the efficient sequential Threshold Algorithm (TA). However, since the size of the dataset can be incredibly large, the middleware cost of sequential TA may be intolerable. So, in this paper, we pro...